A Novel Excitation Approach for HMM-Based Speech Synthesis
Abstract
One drawback of HMM-based speech synthesis, the technique in which speech parameters are generated directly from hidden Markov models (HMMs), is the unnaturalness of the synthesized speech. The problem stems from the rough excitation model used in the waveform generation stage. This report introduces a new excitation approach that attempts to solve this problem. The proposed scheme feeds the mel log spectrum approximation (MLSA) filter with a mixed excitation obtained through a set of state-dependent filters. The filters are derived from the speech database through a closed-loop procedure in which the likelihood of the residual is maximized.
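As a rough illustration of what a mixed excitation built from state-dependent filters might look like, the sketch below sums a filtered pulse train and filtered white noise for a single state. The abstract does not name the source signals, so the pulse-plus-noise construction is an assumption here (it follows the related residual-modeling abstract listed further down). The function names, the FIR form of the filters, and the tap values are all hypothetical; in the described scheme the filters would come from closed-loop training, and the result would then drive the MLSA filter, which is not implemented here.

```python
import numpy as np

def pulse_train(n_samples, f0_hz, sample_rate):
    """Unit pulses spaced one pitch period apart (assumed voiced source)."""
    period = int(round(sample_rate / f0_hz))
    pulses = np.zeros(n_samples)
    pulses[::period] = 1.0
    return pulses

def mixed_excitation(n_samples, f0_hz, sample_rate, voiced_fir, unvoiced_fir, rng):
    """Filtered pulse train plus filtered white noise for one HMM state."""
    pulses = pulse_train(n_samples, f0_hz, sample_rate)
    noise = rng.standard_normal(n_samples)
    # Each source passes through its own (hypothetical) state-dependent FIR filter.
    voiced = np.convolve(pulses, voiced_fir)[:n_samples]
    unvoiced = np.convolve(noise, unvoiced_fir)[:n_samples]
    return voiced + unvoiced

rng = np.random.default_rng(0)
# Placeholder taps for a single state; real taps would be learned from data.
voiced_fir = np.array([0.5, 1.0, 0.5])
unvoiced_fir = np.array([0.1, -0.1])
excitation = mixed_excitation(400, 100.0, 16000, voiced_fir, unvoiced_fir, rng)
```

In a full synthesizer the filter pair would change from state to state along the utterance; the sketch keeps a single state so that the mixing step stays visible.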
Similar resources
An excitation model for HMM-based speech synthesis based on residual modeling
This paper describes a trainable excitation approach to eliminate the unnaturalness of HMM-based speech synthesizers. During the waveform generation part, mixed excitation is constructed by state-dependent filtering of pulse trains and white noise sequences. In the training part, filters and pulse trains are jointly optimized through a procedure which resembles analysis-by-synthesis speech codin...
A trainable excitation model for HMM-based speech synthesis
This paper introduces a novel excitation approach for speech synthesizers in which the final waveform is generated through parameters directly obtained from Hidden Markov Models (HMMs). Despite the attractiveness of the HMM-based speech synthesis technique, namely utilization of small corpora and flexibility concerning the achievement of different voice styles, synthesized speech presents a cha...
Excitation modeling based on waveform interpolation for HMM-based speech synthesis
It is generally known that a well-designed excitation produces high-quality signals in hidden Markov model (HMM)-based speech synthesis systems. This paper proposes a novel technique for generating excitation based on waveform interpolation (WI). For modeling WI parameters, we implemented a statistical method such as principal component analysis (PCA). The parameters of the proposed excitation ...
Statistical Approaches to Excitation Modeling in HMM-Based Speech Synthesis
In our previous study, we proposed the waveform interpolation (WI) approach to model the excitation signals for hidden Markov model (HMM)-based speech synthesis. This letter presents several techniques to improve excitation modeling within the WI framework. We propose both time-domain and frequency-domain zero-padding techniques to reduce the spectral distortion inherent in the synthesized ...
HMM-based Speech Synthesis with an Acoustic Glottal Source Model
A major cause of degradation of speech quality in HMM-based speech synthesis is the use of a simple delta pulse signal to generate the excitation of voiced speech. This paper describes a new approach to using an acoustic glottal source model in HMM-based synthesisers. The goal is to improve speech quality and parametric flexibility to better model and transform voice characteristics.